arXiv:1501.02959v1 [quant-ph] 13 Jan 2015


Strong experimental guarantees in ultrafast quantum random number generation 

Morgan W. Mitchell,^{1,2} Carlos Abellan,^1 and Waldimar Amaya^1

^1 ICFO-Institut de Ciències Fotòniques, Av. Carl Friedrich Gauss, 3, 08860 Castelldefels, Barcelona, Spain.

^2 ICREA-Institució Catalana de Recerca i Estudis Avançats, 08015 Barcelona, Spain

(Dated: January 14, 2015) 

We describe a methodology and standard of proof for experimental claims of quantum random number generation (QRNG), analogous to well-established methods from precision measurement. For appropriately constructed physical implementations, lower bounds on the quantum contribution to the average min-entropy can be derived from measurements on the QRNG output. Given these bounds, randomness extractors allow generation of nearly perfect “ε-random” bit streams. An analysis of experimental uncertainties then gives experimentally derived confidence levels on the ε-randomness of these sequences. We demonstrate the methodology by application to phase-diffusion QRNG, driven by spontaneous emission as a trusted randomness source. All other factors, including classical phase noise, amplitude fluctuations, digitization errors and correlations due to finite detection bandwidth, are treated with paranoid caution, i.e., assuming the worst possible behaviors consistent with observations. A data-constrained numerical optimization of the distribution of untrusted parameters is used to lower bound the average min-entropy. Under this paranoid analysis, the QRNG remains efficient, generating at least 2.3 quantum random bits per symbol with 8-bit digitization and at least 0.83 quantum random bits per symbol with binary digitization, at a confidence level of 0.99993. The result demonstrates ultra-fast QRNG with strong experimental guarantees.


I. INTRODUCTION 

Quantum random number generation extracts randomness from quantum mechanical processes and measurements. Processes used have included radioactive decay [?], path-splitting of single photons [5], photon number path entanglement [3], amplified spontaneous emission [3], measurement of the phase noise of a laser [?], photon arrival time [?], vacuum-seeded bistable processes [?], and stimulated Raman scattering [?]. Quantum random number generators are attractive because their randomness can be linked to well-tested principles of quantum mechanics, e.g., the uncertainty principle [12], which guarantees a minimum amount of randomness in some physical quantities.

Physics plays an essential role in QRNG, not only at the generation stage, but also when making claims of randomness. While it is common to test generated data against statistical test suites [13], these tests can only identify nonrandomness, i.e., patterns in the output. For fundamental reasons, statistical tests cannot confirm randomness of finite sequences [14]. In contrast, physical models can support a randomness claim, as we describe in this work.

Trust plays a central role in contemporary discussions of QRNG, as it does in quantum cryptography. Cryptography employs trust models that define what parts of a communication system are assumed to be understood, in contrast to those that could be under the control of an adversary. A strategy that trusts fewer parts of the system places a lower burden on verification. In an extreme of paranoia, “device-independent” (DI) strategies distrust even the measurement devices employed by the communicating parties. The DI approach aims to provide security against hardware-based attacks [20], and some progress toward DI QRNG has been demonstrated [?].

It is important to note that DI techniques aim to guarantee considerably more than randomness. They use loophole-free Bell inequality violations [?], or other evidence for nonlocality [?], in conjunction with monogamy relations and the no-signaling principle to guarantee that no other actor could be in possession of a copy of the generated random numbers. This guarantee has obvious security value and explains much of the interest in DI quantum key distribution and DI QRNG. In practice, however, loophole-free Bell inequality violations are experimentally difficult, and the demonstrated rates are very low. A heroic experiment that still left open the timing loophole produced 42 random bits in 1 month [?], 15 orders of magnitude slower than other techniques [?]. For the foreseeable future, practical use of QRNGs will require verification. Moreover, many randomness applications, e.g., Monte Carlo simulations, have no reason to protect themselves against information leakage and obtain no benefit from the additional security of the DI approach.

Nearly all experimental claims of QRNG to date implicitly or explicitly assume nonadversarial devices, with varying degrees of trust in their sources [?]. To take the best-known example, splitting a single photon on an ideal 50:50 beam-splitter gives a random direction to the photon, and this direction can be measured to give one perfectly random bit. DI-grade paranoia is not practical in this scenario; if the beam-splitter transmission were under the control of an adversary, she could determine every outcome. It is thus necessary to verify the performance of the device. Unfortunately, most QRNG claims, indeed all that we are aware of, leave important gaps in the verification. In the beam-splitter example, a variety of classical effects could steer the outcome: correlations in the photon source, inefficiency in the detectors, light entering the unused port of the interferometer, sensitivity of the beam-splitter to polarization, frequency, beam position, beam direction, or any other variable that might fluctuate in the light source, to name a few. Some of these effects, e.g., variable detector efficiency [?], have been accounted for, while others have not. For continuous-variable (CV) QRNGs, a category that includes the fastest devices, the accounting for noise and detection bandwidth has to date been unrealistically optimistic. For example, it is often assumed that digitization noise is independent of the quantum noise being digitized [?] or that detection systems introduce no correlations [25, 32]. As we show in Secs. V and VI, these assumptions are unwarranted in real systems. Concerning analysis, only a few experimental works [?] quantify their performance using measures compatible with modern randomness extraction (see Sec. II).

We propose a standard of proof for quality assurance in QRNG, between the paralyzing “trust-nothing” paranoia of the DI approach and the risky insouciance of most QRNG demonstrations to date. We refer to this as metrology-grade paranoia. The name notes the similarity of the verification required for characterization of a QRNG and the verification required to make a precision measurement. Both practices assume that the system is fundamentally understandable, but take a conservative and rigorous approach to calibration and experimental imperfections, i.e., to systematic errors. A modern precision measurement, e.g., of the transition frequency in an atomic clock, will take into account a large variety of possible systematic errors and give a quantitative estimation of their effect on the measurement result [33] [31]. Both approaches burden the experimenter with understanding and quantifying all relevant aspects of their system. The success of similar approaches in precision measurement reassures us that this burden is not unbearable.

We apply our approach to phase-diffusion QRNG [?], the fastest reported QRNG approach [?]. We show that the statistics of the measured output provide lower bounds on the amount of quantum randomness contained in the data stream, allowing the generation of ε-random sequences and the assignment of confidence levels to the purity of the randomness. We find that the claims for pulsed phase diffusion survive metrology-grade paranoia, and thus it is possible to have simultaneously a very high bit rate and strong randomness assurance in a practical system.


II. RANDOMNESS QUANTIFICATION 

A perfect physical device is not required for near-perfect randomness generation. Algorithms known as randomness extractors (REs) [?] convert partly random data into nearly perfect “ε-random” bit strings by a hashing process [36]. If $d$ is a random symbol with probability distribution $P(d)$, then $\mathcal{P} \equiv \max_d P(d)$ is the predictability, and $H_\infty \equiv -\log_2 \mathcal{P}$ is the min-entropy. Information-theoretically provable REs [?, 37] can produce ε-random output bit strings with a length given by their input min-entropy.

FIG. 1. (Color online) (Top) Schematic of phase-diffusion QRNG. A single-mode diode laser is strongly current modulated to produce a train of phase-randomized output pulses with field strengths $E(t)$. Interference of subsequent pulses is performed with a Mach-Zehnder interferometer (MZI), consisting of single-mode 2×2 couplers (cpl) and a relative delay equal to the pulse-repetition period $\tau$. A photodiode (PD) converts the output pulse powers into electrical current, which is amplified (amp) and converted to digital values with a digitizer (dig). Either arm of the MZI can be broken to measure the pulse amplitude in the other arm. (Middle) Time domain recording of a short digitized sequence of $p^{(i)}$, the interferometer output with interference (top, blue), and $p^{(s)}$ (middle, red), and $p^{(l)}$ (bottom, beige), the outputs of the interferometer with only the short or long path open, respectively. Data have been shifted to have equal baselines. (Bottom) Histograms (scaled for equal height) for $p^{(i)}$ (wide, blue), $p^{(s)}$ (left narrow, red), and $p^{(l)}$ (right narrow, beige). The wide $p^{(i)}$ distribution arises from interference and resembles the arcsine distribution that describes $\cos(\phi)$ when $\phi$ is uniformly distributed.

FIG. 2. (Color online) Measured digitization error frequencies and error limits. Color indicates relative frequency from zero (black) to maximum (white). It is interesting to note the presence of both a large-scale nonlinearity in the conversion (the general trend) and small-scale regularities (e.g., the period-two patterns clearly visible between 50 and 60). Green traces above and below indicate the largest and smallest errors observed, respectively. Approximately $2^{14}$ samples per digitization value were used to obtain the frequencies, so the confidence that a new event will fall within the limits is $\approx 1 - 2^{-14}$.
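The definitions of predictability and min-entropy above can be illustrated with a minimal sketch. The function name `min_entropy` and the two example distributions are illustrative assumptions, not taken from the experiment.

```python
import math

def min_entropy(probabilities):
    """Return (predictability, min-entropy in bits) of a discrete distribution.

    Predictability is the probability of the most likely symbol;
    min-entropy is -log2 of that value.
    """
    if abs(sum(probabilities) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    predictability = max(probabilities)
    return predictability, -math.log2(predictability)

# Hypothetical biased 4-valued symbol: the most likely value occurs half the time.
p_biased, h_biased = min_entropy([0.5, 0.25, 0.125, 0.125])

# A uniform 8-bit symbol reaches the maximum of 8 bits per symbol.
p_uniform, h_uniform = min_entropy([1.0 / 256] * 256)
```

Note that the biased symbol has a Shannon entropy of 1.75 bits but a min-entropy of only 1 bit; the min-entropy is the figure relevant to extractor output length.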

Real devices do not operate under constant conditions, and it is necessary to accommodate the possibility that a QRNG is at some moments producing higher-quality randomness than at other moments. We can describe this situation by saying that the symbol $d$ has a probability distribution $P(d|x)$, where $x$ describes the condition of the source when $d$ is produced. Although $x$ may vary, it is not a source of true randomness. It describes parameters not trusted to be random; for example, the $x$ variation may be deterministic but unknown to us. We consider the randomness quantification from the perspective of someone, perhaps an adversary, who knows $x$. Because $x$ includes all of the untrusted variables, and because the trusted variables are independent, subsequent $d$ are independent, in the sense that the probability $P(\{d\}|\{x\})$ of generating a string of output symbols $\{d\} \equiv (d_1, \ldots, d_N)$ under conditions $\{x\} \equiv (x_1, \ldots, x_N)$ is given by the product $P(\{d\}|\{x\}) = \prod_i P(d_i|x_i)$. The conditional min-entropy of $\{d\}$ is then

$$H_\infty(\{d\}|\{x\}) \equiv -\log_2 \max_{\{d\}} P(\{d\}|\{x\}) = \sum_i H_\infty(d_i|x_i), \tag{1}$$

where $H_\infty(d|x) \equiv -\log_2 \max_d P(d|x)$ is the conditional min-entropy of a single symbol generated with conditions $x$. Note that $H_\infty(\{d\}|\{x\})$ does not depend on the order of the elements of $\{x\}$, so that a knowledge of the relative frequencies $P_{\mathrm{rel}}(x)$ with which the conditions $x$ appear in $\{x\}$ is sufficient to compute the mean min-entropy per symbol,

$$\bar{H}_\infty \equiv \int dx\, P_{\mathrm{rel}}(x)\, H_\infty(d|x). \tag{2}$$

As we shall see, a measured string $\{d\}$, combined with a model of how $x$ and trusted randomness interact in the source to produce $d$, constrains $P_{\mathrm{rel}}(x)$, and thus provides a bound on $\bar{H}_\infty$ for that string. In this way, randomness guarantees, with no prior assumptions about $\{x\}$, can be generated, at the cost of analyzing each raw string $\{d\}$.

If we allow ourselves to assume that the conditions $\{x\}$ are independent random variables [38], it suffices to characterize $P(x)$, the distribution of $x$, rather than $P_{\mathrm{rel}}(x)$, the relative frequencies that actually occur. REs adapted to this probabilistic situation [?] give ε-random output with length limited by the average min-entropy, defined as

$$\tilde{H}_\infty \equiv -\log_2 \int dx\, P(x) \max_d P(d|x). \tag{3}$$

Note the difference relative to Eq. (2); here the logarithm is outside of the average. This reduces the entropy, so that for $P(x) = P_{\mathrm{rel}}(x)$, $\tilde{H}_\infty \le \bar{H}_\infty$. As with $P_{\mathrm{rel}}(x)$ and $\bar{H}_\infty$, $P(x)$ and $\tilde{H}_\infty$ can be bounded using knowledge of a measured string $\{d\}$, but this calculation only needs to be performed once, and can be performed with a very long string $\{d\}$, to precisely estimate $P(x)$.

In what follows, we work with $P(x)$ and $\tilde{H}_\infty$, the more conservative of the two entropy measures, although the same methods can be applied to $P_{\mathrm{rel}}(x)$ and $\bar{H}_\infty$.
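A small numerical sketch may help to see why the average min-entropy of Eq. (3) is the more conservative measure. The two-condition source below (one "good" condition giving a uniform 4-valued symbol, one "bad" condition giving a deterministic symbol) is a hypothetical example, and the function names are assumptions made for illustration.

```python
import math

def mean_min_entropy(weights, conditional_dists):
    """Eq. (2): average over conditions of -log2 max_d P(d|x)."""
    return sum(w * -math.log2(max(dist))
               for w, dist in zip(weights, conditional_dists))

def average_min_entropy(weights, conditional_dists):
    """Eq. (3): -log2 of the predictability averaged over conditions."""
    averaged_predictability = sum(w * max(dist)
                                  for w, dist in zip(weights, conditional_dists))
    return -math.log2(averaged_predictability)

# Hypothetical source: half the time uniform over 4 symbols, half the time stuck.
weights = [0.5, 0.5]
dists = [[0.25, 0.25, 0.25, 0.25], [1.0, 0.0, 0.0, 0.0]]

h_mean = mean_min_entropy(weights, dists)      # 0.5 * 2 bits + 0.5 * 0 bits
h_avg = average_min_entropy(weights, dists)    # -log2(0.5 * 0.25 + 0.5 * 1.0)
```

In this example the mean min-entropy is 1 bit per symbol, while the average min-entropy is about 0.68 bits, consistent with the inequality stated after Eq. (3).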


III. METHODOLOGY 

In principle, the prescription for metrology-grade paranoia is simple. First, describe the process by which a quantum random variable, in our case the laser phase diffusion due to spontaneous emission, and other experimental variables $x$ combine to produce measurement results $d$. Second, use the distribution of the quantum variable, known from first principles or from modeling, to calculate $P(d|x)$, the distribution of symbols $d$, conditioned on $x$. Third, find the lowest value of the average min-entropy $\tilde{H}_\infty$ that is consistent with what is known about $x$, i.e., with experimental or theoretical constraints on $P(x)$, the distribution of $x$.

Knowing this lower bound, a RE can then be used to produce an ε-random bit string, with length $\approx N \tilde{H}_\infty$, where $N$ is the number of symbols in the raw data string. Confidence in the randomness of this bit string derives from the confidence in $P(x)$. For example, if statistical and systematic uncertainties give 99% confidence that the process produced at least the bounded average min-entropy, then the extracted bit string is ε-random with at least that same confidence level.

The consistency condition is an invitation to paranoia. For example, it has sometimes been assumed in QRNG work that digitization errors are independent of the quantum signal being digitized, and simply add entropy to the raw data, an entropy that is not of quantum origin and must be accounted for in order to not overestimate the quantum entropy, but is otherwise harmless. But is this really the case? How can one be sure that the noise added by the digitizer is independent of the signal? Unless one possesses specific knowledge about this characteristic of the digitizer in question, one must admit that our knowledge is consistent with less favorable scenarios [25]. For example, the digitizer might organize its errors to bias the results toward one subset of possible symbols, reducing the entropy and in effect consuming some of the quantum randomness present. A paranoid analysis must assume this is indeed happening, and in the way that reduces the average min-entropy $\tilde{H}_\infty$ as much as possible.

To show that this methodology can be used in practice, 
we perform this analysis on a phase-diffusion QRNG, of 
the same design as [8]. 

IV. MODEL 

We start with the model shown in Fig. 1 (top), corresponding to [?]. A single-mode diode laser is driven with a strongly modulated injection current with period $\tau$. For all data shown in this work, $\tau = 5$ ns. The optical output of the laser, described by the field $E(t)$, is fed to the input of an unbalanced Mach-Zehnder interferometer (MZI), with short and long delays $\tau_s$ and $\tau_l = \tau_s + \tau$, respectively. The field exiting the MZI is

$$E_i(t) = \mathcal{T}_s E(t - \tau_s) + \mathcal{T}_l E(t - \tau_l), \tag{4}$$

where $\mathcal{T}_s$ and $\mathcal{T}_l$ are the transmission coefficients, including both couplers, for the short and long paths, respectively. A photodiode converts the incident power, $p^{(i)}(t) \equiv |E_i(t)|^2$, into a current, which is amplified and digitized at times $t_i = i\tau$, $i = 1, 2, \ldots$, with the time origin chosen near the peak of the pulse. Due to strong phase diffusion between times $t_i$ and $t_{i+1}$, the detected signal shows a strong variation that is not present in the input pulses. This is illustrated in Fig. 1 (middle), which shows digitized signals, both from the complete MZI with interference, and from the MZI with either arm interrupted. Histograms of the resulting interference and single-path signals are shown in Fig. 1 (bottom).

The phase between pulses contains a quantum contribution $\phi^{(q)}$ as well as a classical contribution $\phi^{(c)}$ due to the relative phase of the interferometer arms and to classical fluctuations in laser parameters such as injection current. As described in the Appendix, quantum theory of laser dynamics [?] predicts that $\phi^{(q)}$ is independently distributed from one pulse to the next, with a Gaussian probability density function (PDF) $P(\phi^{(q)})$ of rms width $\sigma_q$. We keep $\sigma_q$ as a parameter, in order to study its effect on randomness generation. Writing the total phase $\phi^{(q)}(t) + \phi^{(c)}(t) = \arg \mathcal{T}_s E(t - \tau_s) - \arg \mathcal{T}_l E(t - \tau_l)$ and suppressing time dependencies for clarity, the optical signal, i.e., the instantaneous power, is

$$p^{(i)} = p^{(s)} + p^{(l)} + 2\mathcal{V}\sqrt{p^{(s)} p^{(l)}}\, \cos(\phi^{(q)} + \phi^{(c)}), \tag{5}$$

where $p^{(s)}(t) \equiv |\mathcal{T}_s E(t - \tau_s)|^2$, $p^{(l)}(t) \equiv |\mathcal{T}_l E(t - \tau_l)|^2$, and $\mathcal{V}(t)$ is the interference visibility. We assume the photodetection and amplification process is linear and stationary, so the electrical signal arriving at the digitizer is

$$V(t) = \int dt'\, G(t - t')\, p^{(i)}(t') + V^{(\mathrm{el})}(t), \tag{6}$$

where $G$ is the impulse response of the detector-amplifier-digitizer system and $V^{(\mathrm{el})}$ is the summed electronic noise from all sources. Finally, the digitizer converts $V$ to a digital value $d$. Digitization is a highly nonlinear process, and requires special care, as we now describe.
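The optical signal model of Eq. (5) can be sketched numerically as follows. This is only an illustration under stated assumptions: the powers are in arbitrary units, the parameter values (equal path powers, visibility 0.9, classical phase 0.3 rad, phase-diffusion width 3π/2) are hypothetical choices, and the function name `interference_power` is not from the paper.

```python
import math
import random

def interference_power(p_s, p_l, visibility, phi_c, sigma_q, rng):
    """One sample of the interference power of Eq. (5).

    The quantum phase is drawn from a zero-mean Gaussian of rms width sigma_q;
    all other arguments are treated as fixed (untrusted) conditions.
    """
    phi_q = rng.gauss(0.0, sigma_q)
    return p_s + p_l + 2.0 * visibility * math.sqrt(p_s * p_l) * math.cos(phi_q + phi_c)

rng = random.Random(1)  # fixed seed so the sketch is reproducible
samples = [interference_power(1.0, 1.0, 0.9, 0.3, 1.5 * math.pi, rng)
           for _ in range(20000)]
mean_power = sum(samples) / len(samples)
```

With strong phase diffusion the samples spread over the full interference range, here from 0.2 to 3.8, with a mean close to the sum of the single-path powers; a histogram of `samples` would approach the arcsine shape mentioned in the caption of Fig. 1.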


V. DIGITIZATION 

Fig. 1 (bottom) illustrates a feature of digitization. This process adds classical noise, e.g., from the amplification, and moreover employs a highly nonlinear electronic operation to convert a continuum of inputs $p^{(i)}$ into a finite set of outputs $d$. Although it may be tempting to assume that errors in this process are independent of $p^{(i)}$ (as is typically the case for amplifier noise), this is clearly untrue for digitization noise. For example, a digitizer will normally have a measurable preference for even versus odd outputs [13], something that would not occur if errors were independent of the input. In Fig. 1 (bottom), an oscillation in the histogram frequencies with period 4 is clearly visible, with an amplitude that is modulated with a period of 16. These errors have an rms width of 0.8 codes, i.e., increments of the digitizer output, when averaged over all $d$, and are clearly not independent of $p^{(i)}$.

We experimentally bound the size of digitization errors as follows. We use an electronic function generator (Tabor WW1281A) followed by a low-pass filter to produce a quasistatic voltage (a 1-kHz triangle wave) and digitize this signal with our fast 8-bit digitizer (Acqiris U1084A) and simultaneously with a 14-bit oscilloscope (Agilent Infiniium 86100C with an electronic module Agilent 86112A) for reference. Fig. 2 shows the distribution of digitization errors, i.e., of the deviation of the digitized value from the ideal value, based on $\approx 2^{14}$ samples per digitization value. This allows us to identify limits $p_{d,-}^{(\mathrm{d})}$ and $p_{d,+}^{(\mathrm{d})}$, the minimum and maximum voltages, respectively, that were observed to produce a given digitization value $d$. Below, to compute a lower bound on the min-entropy in the presence of digitization errors, we assume that digitization results outside of these limits are so improbable as to have a negligible effect on the min-entropy. We note that electronic noise during the characterization measurements, e.g., in the voltage source or in the reference oscilloscope, can only broaden these bounds, making them conservative.
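The characterization step described above, finding for each output code the smallest and largest reference input that produced it, can be sketched as follows. The calibration pairs are invented for illustration (reference voltage expressed in units of codes), and `digitization_limits` is a hypothetical helper name.

```python
def digitization_limits(calibration_pairs):
    """Map each observed code d to (lowest, highest) reference input seen for it.

    calibration_pairs is an iterable of (reference_value, code) tuples, where
    reference_value comes from the higher-resolution reference instrument.
    """
    limits = {}
    for reference, code in calibration_pairs:
        low, high = limits.get(code, (reference, reference))
        limits[code] = (min(low, reference), max(high, reference))
    return limits

# Hypothetical calibration samples: (reference input in code units, digitizer output).
pairs = [(9.7, 10), (10.2, 10), (10.9, 10), (11.3, 10),
         (10.8, 11), (11.5, 11), (12.4, 11)]
limits = digitization_limits(pairs)
```

In this invented data set, code 10 is produced by inputs from 9.7 to 11.3, wider than its ideal range of 10 to 11, and the ranges of adjacent codes overlap, which is the behavior the error limits are meant to capture.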
FIG. 3. (Color online) Normalized correlation and recovered impulse response. The main graph shows autocorrelation $\mathrm{ac}_\Delta$ computed on a string of 10® symbols. Open blue (solid red) circles indicate positive (negative) correlation. The horizontal line shows sampling uncertainty. The inset shows the reconstructed impulse response function $G_j$, as described in the text.


VI. FINITE BANDWIDTH 

Fig. 1 (middle) illustrates something intrinsic to analog randomness generators. An ideal physical process would produce independent random values, but this is impossible in a real system due to bandwidth limitations. When a digital sample is taken, the detection system is still responding (possibly weakly) to analog inputs it received at earlier times. This is evident in the upper trace of Fig. 1 (middle), which visibly shows electronic ringing and does not fully return to baseline after a strong pulse.

We model this behavior using Eq. (6), but considering only the sampling times $t = t_i$, and write $V_i \equiv V(t_i)$, $G_j \equiv G(t_j)$, etc.:

$$V_i = \sum_{j=0}^{\infty} G_j\, p_{i-j} + V_i^{(\mathrm{el})}. \tag{7}$$

We compute the autocorrelation $\mathrm{ac}_\Delta \equiv \mathrm{cov}(V_i, V_{i+\Delta}) = \sum_{jk} G_j G_k\, \mathrm{cov}(p_{i-j}, p_{i+\Delta-k}) = \mathrm{var}(p) \sum_j G_j G_{j+\Delta}$, plus a contribution from $V^{(\mathrm{el})}$, where we have assumed $\mathrm{cov}(p_i, p_j) = \mathrm{var}(p)\,\delta_{ij}$. For our system, the $V^{(\mathrm{el})}$ contribution is negligible: $\mathrm{var}(V)$ places an upper bound on $\mathrm{var}(V^{(\mathrm{el})})$ for any input power $p$. Yet, if we interrupt one arm of the interferometer, we observe nearly constant signals $V$, as shown in Fig. 1, with variance 39 dB below the variance of the interference signal. Because $\mathrm{ac}_\Delta$ can be directly measured from the data, we have an experimental determination of $\mathrm{ac}_\Delta = \sum_j G_j G_{j+\Delta}$ (in units where $\mathrm{var}(p) = 1$), the autocorrelation of the impulse response. Considering that $G_0 \gg G_{j \neq 0}$, and using the causality condition $G_{j<0} = 0$, we find $G_j$ perturbatively as follows. We write $G_j = \sum_n \lambda^n G_j^{(n)}$, where $\lambda$ is a parameter that later is set to unity, and define the cross correlation $\mathrm{cc}_\Delta^{(m,n)} \equiv \sum_j G_j^{(m)} G_{j+\Delta}^{(n)}$. We write

$$\mathrm{ac}_\Delta = \lambda^0\, \mathrm{cc}_\Delta^{(0,0)} + \lambda^1 \left[ \mathrm{cc}_\Delta^{(0,1)} + \mathrm{cc}_\Delta^{(1,0)} \right] + \lambda^2 \left[ \mathrm{cc}_\Delta^{(0,2)} + \mathrm{cc}_\Delta^{(1,1)} + \mathrm{cc}_\Delta^{(2,0)} \right] + \ldots \tag{8}$$

and solve by orders in $\lambda$ from the starting condition $G_j^{(0)} \propto \delta_{0,j}$. Considering the $\lambda^0$ contribution we find $\mathrm{cc}_\Delta^{(0,0)} = [G_0^{(0)}]^2 \delta_{0,\Delta}$, giving the $\lambda^0$ solution $[G_0^{(0)}]^2 = \mathrm{ac}_0$. Without loss of generality we take $G_0^{(0)}$ to be positive. Considering then $\lambda^1$, we solve $\mathrm{ac}_\Delta = \mathrm{cc}_\Delta^{(0,0)} + \lambda \left[ \mathrm{cc}_\Delta^{(0,1)} + \mathrm{cc}_\Delta^{(1,0)} \right]$, a linear equation for $G^{(1)}$, by matrix inversion. Continuing in a similar fashion for higher orders in $\lambda$, $G_j$ rapidly converges to give the impulse response shown in Fig. 3 (inset). Considering the low degree of observed correlation, it is not surprising that this resembles the correlation $\mathrm{ac}_\Delta$ and is dominated by the $\Delta = 0$ term. It is perhaps interesting to note the narrow negative feature at $\Delta = 10$, probably due to an electronic reflection in the cabling of the digitization electronics.
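The order-by-order solution described above can be sketched in a few lines. This is an illustrative reading of the procedure, not the authors' code. Because the zeroth-order term is a single spike at $j = 0$, the linear equation at each order can be solved explicitly rather than by a general matrix inversion. The residual of the autocorrelation is assigned to first order, and all higher orders are required to cancel. The test impulse response `g_true` is a hypothetical example.

```python
import math

def cross_corr(a, b):
    """cc_Delta = sum_j a[j] * b[j + Delta], Delta = 0..L-1, for causal length-L lists."""
    n = len(a)
    return [sum(a[j] * b[j + d] for j in range(n - d)) for d in range(n)]

def impulse_response(ac, orders=30):
    """Recover a causal impulse response G from its autocorrelation ac[0..L-1]."""
    n = len(ac)
    g0 = math.sqrt(ac[0])                      # zeroth order: G^(0) = sqrt(ac_0) * delta_{0,j}
    terms = [[g0] + [0.0] * (n - 1)]
    for order in range(1, orders + 1):
        # First order absorbs the off-diagonal autocorrelation; higher orders must cancel.
        rhs = [0.0] + list(ac[1:]) if order == 1 else [0.0] * n
        for m in range(1, order):
            cc = cross_corr(terms[m], terms[order - m])
            rhs = [r - c for r, c in zip(rhs, cc)]
        # cc^(0,n) + cc^(n,0) = g0 * G^(n)_Delta * (1 + delta_{0,Delta}) for causal G.
        term = [r / g0 for r in rhs]
        term[0] = rhs[0] / (2.0 * g0)
        terms.append(term)
    return [sum(t[j] for t in terms) for j in range(n)]

# Hypothetical impulse response dominated by its j = 0 element.
g_true = [1.0, 0.05, -0.02, 0.01]
ac = cross_corr(g_true, g_true)
g_rec = impulse_response(ac)
```

For this weakly correlated example the series converges quickly and `g_rec` reproduces `g_true`; convergence relies on the $j = 0$ element dominating, as assumed in the text.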


The net contribution of previous pulses is $V_i^{(\mathrm{ho})} \equiv \sum_{j=-\infty}^{i-1} p_j G_{i-j}$. This contributes to the variance of individual $V_i$ without adding any randomness to the sequence. From the $d_i$ sequence we find bounds $\zeta_- \equiv \min_i V_i^{(\mathrm{ho})} = -0.0145$ full scale, or $-3.7$ codes at 8-bit resolution, and $\zeta_+ \equiv \max_i V_i^{(\mathrm{ho})} = 0.0156$ full scale, or $+4.0$ codes at 8-bit resolution. We refer to $\zeta_-$ and $\zeta_+$ as “hangover errors” for their delayed nature.
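As a rough sketch of how such hangover bounds could be computed from a recorded sequence and a recovered impulse response, consider the following. The short sequence `powers` and the impulse-response tail `tail` are hypothetical values; only the conversion from full-scale fractions to 8-bit codes uses numbers quoted in the text.

```python
def hangover_bounds(powers, tail):
    """Return (min, max) over i of sum_{k>=1} G_k * p_{i-k}.

    tail[k-1] holds G_k for k >= 1, i.e., the impulse response without
    its leading (k = 0) element.
    """
    values = []
    for i in range(len(tail), len(powers)):
        values.append(sum(g * powers[i - k] for k, g in enumerate(tail, start=1)))
    return min(values), max(values)

def full_scale_to_codes(fraction, bits=8):
    """Convert a fraction of full scale into digitizer codes."""
    return fraction * 2 ** bits

# Hypothetical pulse powers and impulse-response tail (G_1, G_2).
zeta_minus, zeta_plus = hangover_bounds([0.0, 1.0, 0.0, 1.0, 1.0], [0.02, -0.01])

# The bounds quoted in the text, expressed in 8-bit codes.
codes_minus = full_scale_to_codes(-0.0145)
codes_plus = full_scale_to_codes(0.0156)
```

The conversion reproduces the quoted values: $-0.0145 \times 256 \approx -3.7$ codes and $0.0156 \times 256 \approx 4.0$ codes.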


VII. REFINEMENT OF THE PROBLEM 

Having established a model for the device, we now ask the following: Trusting only $\phi^{(q)}$ to be random, how much randomness exists in the output string? In particular, we do not trust $p^{(s)}$, $p^{(l)}$, $\mathcal{V}$, $\phi^{(c)}$, $V^{(\mathrm{el})}$, or $V^{(\mathrm{ho})}$ to be random. Fluctuations in these quantities can be traced to fluctuations of classical variables, for example, the injection current of the diode, that certainly contain patterns, and that could, in principle, be described by a perfectly deterministic pattern unknown to us. We are not, however, completely ignorant about these quantities; their distributions are constrained by the digitization and correlation measurements described above, as well as by the measured distributions of $d^{(i)}$, $d^{(s)}$, and $d^{(l)}$.

A key observation is illustrated by Fig. 1 (bottom). The distributions of $d^{(s)}$ and $d^{(l)}$ are very narrow, whereas the distribution of $d^{(i)}$ is broad. Provided the digitization gives a not-too-unfaithful conversion from $p$ to $d$, we conclude that $p^{(i)}$ varies much more than $p^{(s)}$ or $p^{(l)}$. By Eq. (5), this implies $\mathcal{V} \neq 0$, at least for some fraction of the measured pulses. $\mathcal{V} \neq 0$ in turn means that $p^{(i)}$ (and thus $d^{(i)}$) contains some randomness from $\phi^{(q)}$. Our goal is to make this observation quantitative, to put lower bounds on the quantum randomness of the string $\{d_i^{(i)}\}$.
FIG. 4. (Color online) Illustration of the distribution function $F_{\sigma_q}(p^{(i)}|x)$ that characterizes $p^{(i)}$, given by Eq. (12), for fixed $p^{(s)}$, $p^{(l)}$, $\mathcal{V}$, and $\phi^{(c)}$, and normally distributed $\phi^{(q)}$. (Left) Visualization of the calculation. Gaussian $P(\phi^{(q)})$ (radial coordinate) centered at $\phi^{(c)}$ (polar coordinate) has probability mass (green area) given by the error function between limits given by the arccosine of the scaled and shifted $p^{(i)}$ (horizontal coordinate). (Right) Illustration of $F_{\sigma_q}(p^{(i)}|x)$ for fixed $p^{(s)} = p^{(l)} = \mathcal{V}$ and $\phi^{(c)} = \pi/8, \ldots, \pi$, from left to right.


VIII. DIGITIZATION LIMITS

An ideal digitization process would output the value $d \in [0, N-1]$ for inputs in the range $p \in [p_{d,-}^{(\mathrm{ideal})}, p_{d,+}^{(\mathrm{ideal})})$, where

$$p_{d,-}^{(\mathrm{ideal})} \equiv \begin{cases} -\infty & d = 0 \\ d & \text{otherwise} \end{cases} \tag{9}$$

$$p_{d,+}^{(\mathrm{ideal})} \equiv \begin{cases} \infty & d = N-1 \\ d+1 & \text{otherwise.} \end{cases} \tag{10}$$

We have seen, however, that our digitizer sometimes makes errors; i.e., it outputs a value $d$ when $p \notin [p_{d,-}^{(\mathrm{ideal})}, p_{d,+}^{(\mathrm{ideal})})$. The distribution of these errors is illustrated in Fig. 2 and can be roughly characterized by the rms width $\approx 0.8$ codes. Defining $p_{d,-}^{(\mathrm{d})}$ and $p_{d,+}^{(\mathrm{d})}$ as the minimum and maximum inputs, respectively, that are seen to give rise to an output $d$, we can say with confidence that an output $d$ implies an input $p \in [p_{d,-}^{(\mathrm{d})}, p_{d,+}^{(\mathrm{d})})$. This also allows us to bound the probability $P(d)$ of an output $d$. Given a cumulative distribution function (CDF) $F(p)$ for the input, the output satisfies $P(d) \le F(p_{d,+}^{(\mathrm{d})}) - F(p_{d,-}^{(\mathrm{d})})$.

We can also include errors due to finite bandwidth in this description. If the minimum and maximum hangover are $\zeta_-$ and $\zeta_+$, respectively (cf. Sec. VI), then a value $d$ implies $p \in [p_{d,-}^{(\mathrm{d+h})}, p_{d,+}^{(\mathrm{d+h})})$, where $p_{d,\pm}^{(\mathrm{d+h})} \equiv p_{d,\pm}^{(\mathrm{d})} + \zeta_\pm$ (the superscript indicates the combined effects of digitization and hangover errors). These digitization limits including hangover will be used to evaluate digitization of the strongly varying signal $p^{(i)}$, while the limits without hangover will be used for the weakly varying $p^{(s)}$ and $p^{(l)}$, for which the hangover error is negligible.


IX. POSSIBLE DISTRIBUTIONS

For given $x = (p^{(s)}, p^{(l)}, \mathcal{V}, \phi^{(c)})$ and with $\phi^{(q)}$ normally distributed with mean zero and rms width $\sigma_q$, we can compute $F_{\sigma_q}(p^{(i)}|x)$, the CDF for $p^{(i)}$, as follows. We note the transformation of variables rule: If $Y = f(X)$, where $f$ is a differentiable function and $X$ is a random variable with distribution $P_X(X)$, then the distribution of $Y$ is

$$P_Y(Y) = \sum_i \left| \frac{d}{dY} f_i^{-1}(Y) \right| P_X(f_i^{-1}(Y)), \tag{11}$$

where $f_i^{-1}(Y)$ indicates the $i$'th root of the equation $f(X) = Y$. Applied to Eq. (5) and integrating to find $F_{\sigma_q}(p^{(i)}|x)$ from $P_{p^{(i)}}(p^{(i)})$, we find

$$F_{\sigma_q}(p^{(i)}|x) = 1 - \frac{1}{2} \sum_{n=-\infty}^{\infty} \left[ \mathrm{erf}\!\left( \frac{2\pi n - \phi^{(c)} + \phi_{\det}}{\sqrt{2}\,\sigma_q} \right) - \mathrm{erf}\!\left( \frac{2\pi n - \phi^{(c)} - \phi_{\det}}{\sqrt{2}\,\sigma_q} \right) \right], \tag{12}$$

$$\phi_{\det} \equiv \arccos \frac{p^{(i)} - p^{(s)} - p^{(l)}}{2\mathcal{V}\sqrt{p^{(s)} p^{(l)}}}, \tag{13}$$

where erf is the error function. This result is illustrated in Fig. 4. The CDF has the usual interpretation: The probability to find $p^{(i)}$ in an interval $[a, b)$ is $F_{\sigma_q}(b|x) - F_{\sigma_q}(a|x)$.

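We are also interested in the case where $\phi^{(c)}$ is completely uncertain, or equivalently uniformly distributed on $[0, 2\pi)$. This gives

$$F_0(p^{(i)}|x) = 1 - \frac{1}{\pi} \mathrm{Re}\left[ \arccos \frac{p^{(i)} - p^{(s)} - p^{(l)}}{2\mathcal{V}\sqrt{p^{(s)} p^{(l)}}} \right], \tag{14}$$

which not surprisingly is the $\sigma_q \to \infty$ limit of $F_{\sigma_q}(p^{(i)}|x)$.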
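The two conditional CDFs can be evaluated numerically with a short sketch. It assumes the model of Eq. (5), with a Gaussian total phase centered at the classical phase. The function names and the truncation of the infinite sum at `n_max` are assumptions made for illustration. Clamping the arccosine argument to $[-1, 1]$ implements the real part in Eq. (14).

```python
import math

def phi_det(p, p_s, p_l, visibility):
    """Real part of arccos of the scaled and shifted power, Eq. (13)."""
    c = (p - p_s - p_l) / (2.0 * visibility * math.sqrt(p_s * p_l))
    return math.acos(max(-1.0, min(1.0, c)))

def cdf_uniform_phase(p, p_s, p_l, visibility):
    """CDF of the interference power for a uniformly distributed phase, Eq. (14)."""
    return 1.0 - phi_det(p, p_s, p_l, visibility) / math.pi

def cdf_gaussian_phase(p, p_s, p_l, visibility, phi_c, sigma_q, n_max=50):
    """CDF for a Gaussian phase of rms width sigma_q centered at phi_c.

    Sums the Gaussian probability mass in the windows
    (2*pi*n - phi_det, 2*pi*n + phi_det), truncated at |n| <= n_max.
    """
    d = phi_det(p, p_s, p_l, visibility)
    scale = math.sqrt(2.0) * sigma_q
    total = 0.0
    for n in range(-n_max, n_max + 1):
        centre = 2.0 * math.pi * n - phi_c
        total += math.erf((centre + d) / scale) - math.erf((centre - d) / scale)
    return 1.0 - 0.5 * total

# Hypothetical conditions: equal path powers, unit visibility.
f_uniform_mid = cdf_uniform_phase(2.0, 1.0, 1.0, 1.0)
f_gauss_wide = cdf_gaussian_phase(2.0, 1.0, 1.0, 1.0, 0.3, 1.5 * math.pi)
f_gauss_narrow = cdf_gaussian_phase(3.9, 1.0, 1.0, 1.0, 0.0, 0.01)
```

For a phase-diffusion width of $3\pi/2$ the Gaussian-phase CDF is already very close to the uniform-phase result, whereas for a narrow phase distribution centered at zero almost all probability sits near the maximum power, so the CDF just below the maximum is nearly zero.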

Finally, for the noninterfering signals $p^{(s)}$ and $p^{(l)}$, the relevant CDF is

$$F_{|s}(p|x) = \theta(p - p^{(s)}), \tag{15}$$

$$F_{|l}(p|x) = \theta(p - p^{(l)}), \tag{16}$$

where $\theta$ is the Heaviside step function. Given a CDF $F(p|x)$ and a distribution $P(x)$ for $x$, the statistically averaged CDF is

$$F(p) = \int d^4x\, F(p|x)\, P(x). \tag{17}$$

The $p^{(i)}$ digitization frequencies of Fig. 1 were collected with varying $\phi^{(c)}$ due to thermal expansion of the fiber loop in the MZI, and probably several other factors. This causes a $\phi^{(c)}$ drift by much more than $2\pi$ over the time of the acquisition, so it is appropriate to compare the $p^{(i)}$ data against $F_0(p^{(i)})$, which incorporates the $\phi^{(c)}$ averaging. If we write $P^{(s)}(d)$, $P^{(l)}(d)$, and $P^{(i)}(d)$ for the probabilities of digitization outcome $d$ when measuring variable $p^{(s)}$, $p^{(l)}$, $p^{(i)}$, respectively, then the probability of an outcome in the range $l$ to $h$ is $P_{l,h}^{(i)} \equiv \sum_{d=l}^{h} P^{(i)}(d)$, and similar for $P_{l,h}^{(s)}$ and $P_{l,h}^{(l)}$. $P_{l,h}^{(i)}$ is upper bounded by

$$P_{l,h}^{(i)} \le F_0(p_{h,+}^{(\mathrm{d+h})}) - F_0(p_{l,-}^{(\mathrm{d+h})}), \tag{18}$$

where $[p_{d,-}^{(\mathrm{d+h})}, p_{d,+}^{(\mathrm{d+h})})$ is the range, including errors as described above, of the digitization outcome $d$. We can also obtain a lower bound, considering that $P_{l,h} = 1 - P_{0,l-1} - P_{h+1,N-1}$, and that the latter two terms are upper bounded as above. We find

$$P_{l,h}^{(i)} \ge F_0(p_{h+1,-}^{(\mathrm{d+h})}) - F_0(p_{l-1,+}^{(\mathrm{d+h})}). \tag{19}$$

As both $P^{(i)}(d)$ and the limits $p_{d,-}$, $p_{d,+}$ have been measured, Eqs. (18) and (19) provide experimental constraints on $P(x)$.

Analogous constraints apply to the noninterfering signals,

$$P_{l,h}^{(s)} \le F_{|s}(p_{h,+}^{(\mathrm{d})}) - F_{|s}(p_{l,-}^{(\mathrm{d})}), \tag{20}$$

$$P_{l,h}^{(s)} \ge F_{|s}(p_{h+1,-}^{(\mathrm{d})}) - F_{|s}(p_{l-1,+}^{(\mathrm{d})}), \tag{21}$$

and similar for $P_{l,h}^{(l)}$.
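The upper and lower bounds of Eqs. (18) and (19) can be illustrated with a toy digitizer. Everything in this sketch is hypothetical: a 4-level digitizer, an input uniformly distributed over its range, and error limits that widen each ideal bin by 0.2 codes on either side. The helper `outcome_bounds` is an assumed name.

```python
import math

def outcome_bounds(cdf, lower_limits, upper_limits, low, high):
    """Bounds on the probability of a digitized outcome in low..high.

    lower_limits[d] and upper_limits[d] are the smallest and largest inputs
    that can produce code d, including errors. Returns (lower, upper).
    """
    n = len(lower_limits)
    upper = cdf(upper_limits[high]) - cdf(lower_limits[low])
    # Complementary outcomes 0..low-1 and high+1..N-1 are upper bounded the same way.
    below = cdf(upper_limits[low - 1]) if low > 0 else 0.0
    above = cdf(lower_limits[high + 1]) if high < n - 1 else 1.0
    return max(0.0, above - below), min(1.0, upper)

def uniform_cdf(p):
    """CDF of a hypothetical input spread uniformly over [0, 4)."""
    return min(1.0, max(0.0, p / 4.0))

# Ideal bins [d, d+1) widened by an assumed error of 0.2 codes on each side.
lower_limits = [-math.inf, 0.8, 1.8, 2.8]
upper_limits = [1.2, 2.2, 3.2, math.inf]

lower, upper = outcome_bounds(uniform_cdf, lower_limits, upper_limits, 1, 1)
```

For this toy case the true probability of code 1 is 0.25, and the bounds evaluate to 0.15 and 0.35, bracketing it as expected.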

X. RANDOMNESS QUANTIFICATION REDUX 



FIG. 5. (Color online) Optimized piecewise-constant distribution $P(x)$ for 8-bit digitization and $\sigma_q = 3\pi/2$. Axes indicate $p^{(s)}$, $p^{(l)}$, and $\mathcal{V}$; density indicates $s_i$. $\phi^{(c)}$ is not included as an independent dimension because it is chosen according to other criteria (see text). The ranges of $p^{(s)}$ and $p^{(l)}$ are chosen to cover the whole range of these variables allowed by the measured distributions shown in Fig. 1, in light of digitization errors from Fig. 2. The graphic on the left uses worst-case errors (green curves in Fig. 2); the one on the right uses error limits narrower by a factor 0.275. Within these ranges, the space is divided into a uniform 8 × 8 × 32 rectangular grid $\{\chi_i\}$, and corresponding weights $\{s_i\}$ are calculated by numerical minimization of the min-entropy lower bound as in Sec. XI. The probability is concentrated in regions of high visibility, necessary to agree with the wide measured $p^{(i)}$ distribution, and regions of low visibility, which give low min-entropy. The distributions of $p^{(i)}$, $p^{(s)}$, and $p^{(l)}$ that follow from these $P(x)$ are shown in Fig. 6.


We now find a lower bound for Poo as in Sec. but 
including worst-case considerations for digitization and 
hangover errors. As above, we first consider a given 
X, implying a given Pcr^ |x). Inclusion of digitization 
and correlation errors leads to the upper bound 

pW(d|x) < P<,^(pd,-|x) - P^^(pd,+ |x). (22) 

In contrast to p*-®\ p^*\ and V, which are more-or-less 
directly reflected in {dj} and thus have distributions con¬ 
strained by, e.g., Eq. (18), we have little measured infor¬ 
mation about ■ To be conservative, we maximize the 
right-hand side over this variable to find the “worst-case” 
(wc) bounds 


pW(fi|x) < max [F„APd-\A “ F^PdAA] 
(pA) 

= p("'=)(d|x). 


(23) 


Now $\max_d P^{(\mathrm{wc})}(d|x)$ upper bounds the predictability of
a single symbol, produced with a given x. For a string
of symbols, generated as x varies with distribution P(x),
the average min-entropy is lower bounded by Eq. applied
to $P^{(\mathrm{wc})}(d|x)$:

$H_\infty \ge -\log_2 \int dx\, P(x) \max_d P^{(\mathrm{wc})}(d|x) \equiv H_\infty^{(\mathrm{wc},P(x))}$. (24)
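As a numerical illustration of the form of Eq. (24), the following minimal sketch evaluates the bound for a discrete stand-in for P(x). The two-point distribution and the conditional symbol probabilities are made-up values chosen only to show the arithmetic; they are not taken from the experiment, and the function name is hypothetical.

```python
import math

def average_min_entropy(weights, cond_probs):
    """Return -log2( sum_x P(x) * max_d P(d|x) ) for a discrete P(x).

    weights    : probabilities P(x) of each untrusted-parameter value x
    cond_probs : for each x, a list of worst-case symbol probabilities P(d|x)
    """
    if abs(sum(weights) - 1.0) > 1e-12:
        raise ValueError("weights must sum to 1")
    # Average guessing probability: the best single guess for each x,
    # weighted by how often that x occurs.
    p_guess = sum(w * max(p) for w, p in zip(weights, cond_probs))
    return -math.log2(p_guess)

# Hypothetical example: half the time the symbols are uniform over 4 values,
# half the time one symbol is strongly favoured.
weights = [0.5, 0.5]
cond_probs = [[0.25, 0.25, 0.25, 0.25],
              [0.70, 0.10, 0.10, 0.10]]
h_avg = average_min_entropy(weights, cond_probs)

# For comparison: the weighted mean of the per-x min-entropies, which is
# never smaller than h_avg because the logarithm is taken after averaging.
h_mean_of_logs = sum(w * -math.log2(max(p)) for w, p in zip(weights, cond_probs))
```

In this toy case the average guessing probability is 0.475, so the bound is about 1.07 bits per symbol, below the roughly 1.26 bits obtained by averaging the per-x min-entropies.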


XI. OPTIMIZATION 


Our goal is now to minimize $H_\infty^{(\mathrm{wc},P(x))}$, or equivalently
to maximize

$\mathcal{P}^{(\mathrm{wc},P(x))} \equiv \int dx\, P(x) \max_d P^{(\mathrm{wc})}(d|x)$ (25)

by choice of P(x), subject to constraints as in Eqs.
(18)-(21). This will give a conservative estimate of the
contribution of $\phi^{(q)}$ to the min-entropy in the digitized
bit string. We transform this into a linear programming
problem by splitting the x space into a covering
by nonoverlapping regions $\{\chi_i\}$. If $R_{\chi_i}(x) = 1$
for $x \in \chi_i$ and zero otherwise, then the probability to
find $x \in \chi_i$ is $s_i = \int dx\, R_{\chi_i}(x) P(x)$. By assumption
$\int dx\, R_{\chi_i}(x) R_{\chi_j}(x) = 0$ for $i \neq j$.


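Inserting the identity $\sum_i R_{\chi_i}(x) = 1$ into Eq. (25) we find

$\mathcal{P}^{(\mathrm{wc},P(x))} = \int dx \sum_i R_{\chi_i}(x) P(x) \max_d P^{(\mathrm{wc})}(d|x)$ (26)

$\le \sum_i \int dx\, R_{\chi_i}(x) P(x) \max_{x \in \chi_i} \max_d P^{(\mathrm{wc})}(d|x)$ (27)

$= \sum_i s_i \max_{x \in \chi_i} \max_d P^{(\mathrm{wc})}(d|x)$ (28)

$\equiv \mathcal{P}^{(\mathrm{wc},\{s_i,\chi_i\})}$. (29)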


As described below, the maximization over $x \in \chi_i$ in Eq.
(27) makes the coarse-graining procedure conservative.
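The conservative character of this coarse-graining can be checked numerically on a toy problem. The sketch below assumes a one-dimensional x in [0, 1), a uniform P(x), and an arbitrary smooth stand-in g(x) for $\max_d P^{(\mathrm{wc})}(d|x)$; all three are illustrative assumptions, not the quantities of the experiment.

```python
import math

def g(x):
    """Hypothetical stand-in for max_d P^(wc)(d|x); any bounded function works."""
    return 0.3 + 0.2 * math.sin(2.0 * math.pi * x) ** 2

def exact_average(n_fine=12000):
    """Midpoint-rule value of the integral of P(x) g(x), with P(x) = 1 on [0, 1)."""
    return sum(g((k + 0.5) / n_fine) for k in range(n_fine)) / n_fine

def coarse_bound(n_regions, n_fine=12000):
    """Sum_i s_i * max_{x in region i} g(x) for a uniform covering of [0, 1)."""
    per_region = n_fine // n_regions
    total = 0.0
    for i in range(n_regions):
        xs = [(i * per_region + k + 0.5) / n_fine for k in range(per_region)]
        s_i = 1.0 / n_regions                  # weight of region i under uniform P(x)
        total += s_i * max(g(x) for x in xs)   # worst point in the region
    return total

exact = exact_average()
# Nested coverings: each one refines the previous one.
bounds = {n: coarse_bound(n) for n in (2, 4, 8, 16, 32)}
```

Under these assumptions the coarse-grained value never falls below the exact average, and it approaches the exact average from above as the covering is refined.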












[Figure 6 graphic; horizontal axis: bin number, 0 to 250]


FIG. 6. (Color online) Comparison of measured frequencies
against their most conservative interpretation, the prediction
from the optimized P(x). (Top) Prediction from P(x) of
Fig. (left), assuming worst-case tolerances. (Bottom)
Prediction from P(x) of Fig. (right), assuming tolerances 0.275
of worst case. The main graph shows a histogram of observed
$p^{(I)}$ (jagged blue), and vertically offset (inset,
left blue and right red), $p^{(s)}$ and $p^{(l)}$, the same as in Fig.
Superposed smooth green curves show the predicted distribution
$P(p^{(I)})$ computed from P(x) chosen to minimize
$H_\infty^{(\mathrm{wc},P(x))}$. The inset shows, inverted, the predicted
distributions for $p^{(s)}$ and $p^{(l)}$. The predicted distributions
are consistent with the observed data in light of the tolerances
provided by digitization and hangover errors (see Secs. V and VI).
Note the central bump, due to low-visibility parts of the
distribution, that lowers the min-entropy.


The probabilities $s_i$ are constrained by $\int dx\, P(x) = 1$, or

$\sum_i s_i = 1$. (30)


An additional set of constraints, also linear in the $\{s_i\}$,
is generated from Eqs. (18)-(21) by applying the
coarse-grained average

P(.^) (31) 

i 

to Eq. (17), to give

■P'(P)(32) 



[Figure 7 graphic; horizontal axis: RMS width of $\phi^{(q)}$ (radians)]


FIG. 7. (Color online) Min-entropy bound as a function
of $\sigma_q$ for rectangular-lattice coverings of different resolution.
Digitization is 8 bits. The coverings have n(1 × 1 × 4) divisions,
where n = 6, ..., 12, giving the curves shown, from bottom to top.
The inset shows the same curves on a finer scale. Increasing n
gives an increasing lower bound for $H_\infty$. The inevitable error
due to finite covering resolution works to reduce $H_\infty$, making
the estimate conservative. With n = 8 we find 1% accuracy
relative to n = 12, the highest resolution we could optimize
using the MATLAB function linprog and 8 GB of RAM.


describing the various F quantities appearing in Eqs.
(18)-(21). In what follows, the $\chi_i$ are chosen to be
rectangular regions of x space, which facilitates the necessary
integrations. For example, $\int dV\, F_{\sigma_q}(p^{(I)}|x)$ has an
analytic form, reducing the number of numerical integrals.

Having expressed the constraints and objective function
as linear functions of the $s_i$, we use a large-scale
linear programming routine to find the unique solution
$\{s_i\}$ that maximizes $\mathcal{P}^{(\mathrm{wc},\{s_i,\chi_i\})}$ subject to the
set of constraints, for a given covering $\{\chi_i\}$. We arrive at the
bound

$H_\infty \ge -\log_2 \max_{\{s_i\}} \mathcal{P}^{(\mathrm{wc},\{s_i,\chi_i\})} \equiv H_\infty^{(\mathrm{wc},\{\chi_i\})}$. (33)


Illustrations are given in Figs. and. We increase
the resolution, i.e., increase the number of elements in
the covering while decreasing their volumes, to reach our
best estimate of $H_\infty$. Because the target function
$\mathcal{P}^{(\mathrm{wc},\{s_i,\chi_i\})}$ is calculated using the worst point in
each region, as in Eq. (27), while the constraints are calculated
using the region average, as in Eq. (31), the average
min-entropy bound increases with increasing resolution,
making the procedure conservative at finite resolution.
See Fig. for illustration.
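The structure of the optimization can be illustrated on a toy problem small enough to solve by hand. The analysis described here uses MATLAB linprog; the sketch below instead uses brute-force vertex enumeration in plain Python, which is only practical for a handful of regions. The five regions, their worst-case guessing probabilities g, the per-region observable v, and the single mean constraint are all hypothetical stand-ins for the constraints of Eqs. (18)-(21).

```python
import math
from itertools import combinations

def solve_toy_lp(g, v, v_mean):
    """Maximize sum_i s_i*g[i] subject to s_i >= 0, sum_i s_i = 1, sum_i s_i*v[i] = v_mean.

    With two equality constraints, the optimum of a feasible, bounded linear
    program is attained at a vertex with at most two nonzero weights, so it is
    enough to try every single region and every pair of regions.
    """
    n = len(g)
    best_value, best_s = None, None

    def consider(s):
        nonlocal best_value, best_s
        value = sum(si * gi for si, gi in zip(s, g))
        if best_value is None or value > best_value:
            best_value, best_s = value, s

    for i in range(n):                       # all weight on one region
        if abs(v[i] - v_mean) < 1e-12:
            consider([1.0 if k == i else 0.0 for k in range(n)])
    for i, j in combinations(range(n), 2):   # weight shared by two regions
        if v[i] == v[j]:
            continue
        s_i = (v_mean - v[j]) / (v[i] - v[j])
        if 0.0 <= s_i <= 1.0:
            s = [0.0] * n
            s[i], s[j] = s_i, 1.0 - s_i
            consider(s)
    return best_value, best_s

# Hypothetical regions: low v gives a highly predictable output (large g).
v = [0.1, 0.3, 0.5, 0.7, 0.9]
g = [0.90, 0.50, 0.30, 0.20, 0.15]
best_value, best_s = solve_toy_lp(g, v, v_mean=0.7)
h_bound = -math.log2(best_value)   # worst-case min-entropy bound for this toy model
```

In this toy model the worst-case distribution puts weight 0.25 on the lowest-v region and 0.75 on the highest-v region, giving a bound of about 1.57 bits, lower than the 2.32 bits obtained if all weight sat in the region with v = 0.7.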

The statistical analysis described here can, in principle,
be performed on the raw data themselves, i.e., applied to
the symbols {d} prior to randomness extraction. Furthermore,
the analysis uses only the frequencies of the
symbols and is independent of their order. For these reasons,
P(x) need not be stationary in time.
Rather, it describes the distribution of x aggregated over
the time of the data acquisition.


















































[Figure 8 graphic; horizontal axis: $\sigma_q$, RMS width of $\phi^{(q)}$ (radians)]


FIG. 8. (Color online) Lower bound on min-entropy versus
$\sigma_q$ for different digitization resolution, from 1 bit to 8 bits
(bottom to top). Other conditions are: covering resolution
$(p^{(s)}, p^{(l)}, V)$ = 8 × 8 × 32, “worst-case” assumptions for
digitization and hangover errors.


[Figure 9 graphic; horizontal axis: error tolerance (fraction of worst-case), 0.3 to 1.0]


FIG. 9. (Color online) Lower bound on min-entropy versus
error tolerance at $\sigma_q = 3\pi/2$ and covering resolution
$(p^{(s)}, p^{(l)}, V)$ = 8 × 8 × 32. Hollow orange circles show
8-bit digitization (on left scale), filled green circles show binary
digitization (on right scale). Error limits for a given d are
computed using the data shown in Fig. plus the hangover 
errors ((± for digitization, and are interpolated between 
the mean and the worst-case limits by the error tolerance
shown here on the horizontal axis. For error tolerance below
0.275 and with this covering, no P(x) is consistent with the
distributions shown in Fig.


XII. EXPERIMENTAL RESULTS 


uncertainties in the laser parameters, and statistical
uncertainties in the observations, it would, in principle, be
possible to place a confidence level on the assertion that
$\sigma_q > 3\pi/2$. In this case, however, we can see no
reasonable scenario in which the phase diffusion is so much
slower (at least a factor of 58) than calculated; the
experimental results of [8] would have been dramatically
different in that case.

Fig. 8 shows the lower bound $H_\infty^{(\mathrm{wc},\{\chi_i\})}$ on the
average min-entropy, as a function of digitization resolution.
We find a lower bound of 2.3 quantum random bits per symbol
with 8-bit digitization, and 0.83 quantum random bits per symbol
with binary digitization. Constraints are computed as above, from
the 8-bit characterization measurements, and we compute
lower-resolution digitizations by splitting the range [0,1)
into $N = 2^b$ equally spaced bins. We assume worst-case
digitization and hangover errors as in Sec. The results
show a roughly linear increase in $H_\infty^{(\mathrm{wc},\{\chi_i\})}$ versus
b until saturation around b = 6. This supports the
intuitively reasonable conclusion that resolution finer than
the scale of the digitization errors contributes little to
$H_\infty^{(\mathrm{wc},\{\chi_i\})}$.
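The relation between the 8-bit symbols and the lower-resolution digitizations can be sketched as follows, assuming the signal has been normalized to [0, 1) and that a b-bit symbol is the index of one of $2^b$ equally spaced bins. The function names and the sample values are hypothetical.

```python
def digitize(p, bits):
    """Map a normalized signal p in [0, 1) to one of 2**bits equally spaced bins."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must lie in [0, 1)")
    return int(p * (1 << bits))

def rebin(symbol_8bit, bits):
    """Coarsen an 8-bit symbol to `bits` bits by merging adjacent bins."""
    return symbol_8bit >> (8 - bits)

# Hypothetical normalized signal values covering [0, 1).
samples = [k / 1000.0 for k in range(1000)]

# Digitizing directly at b bits agrees with coarsening the 8-bit symbol.
consistent = all(rebin(digitize(p, 8), b) == digitize(p, b)
                 for p in samples for b in range(1, 9))
```

Because the coarse bins are unions of adjacent 8-bit bins, the lower-resolution symbols can be derived from the same 8-bit record without a separate measurement.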

The above results are obtained with a high degree of
statistical confidence. As described in Sec. V, we use as
our error limits the most extreme errors seen in $2^{14}$
samplings for any given digitization output. We thus have a
confidence level of $1 - 2^{-14} \approx 0.999939$ that any given
digitization event will be within our limits and thus is
properly accounted for in computing the average
min-entropy. For hangover errors, due to a larger data set,
this confidence is ~ 1 — 10“®. It will surely be reason¬ 
able to consider less conservative error bounds for some 
applications. We define a fractional error tolerance $\eta$
as follows: Recall that $p^{(d,\pm)}$ and $p^{(d+h,\pm)}$ are the
minimum (−) and maximum (+) values that can give rise
to a symbol d in the ideal and error-adjusted cases,
respectively. Corresponding limits with scaled errors are
p^d+h.r,) ^ ^phi+h) In Fig. [^we show 

$H_\infty^{(\mathrm{wc},\{\chi_i\})}$ versus $\sigma_q$ for different $\eta$, showing
up to 3.5 quantum random bits per symbol in 8-bit digitization,
and up to 0.947 quantum random bits per symbol for
binary digitization.
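The digitization confidence level quoted above is a one-line calculation, reproduced here as a check. The only input is the number of samplings per digitization output; the variable names are illustrative.

```python
# Number of samplings per digitization output, as quoted in the text.
n_samplings = 2 ** 14

# If the limits are the most extreme errors seen in n_samplings trials,
# the chance that a new event falls outside them is taken to be 1/n_samplings.
confidence = 1.0 - 1.0 / n_samplings

print(confidence)  # prints 0.99993896484375
```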


We apply the above analysis to the QRNG described
in [5], based on the data shown in Figs. and. To
apply the analysis, we need a value for $\sigma_q$, which we take
to be $\sigma_q = 3\pi/2$, well into the plateaux seen in Fig.
Previous works describing the same system [5, 8] describe
a rapid phase diffusion, reaching $\sigma_q > 3\pi$ after a
diffusion time of 0.17 ns. Our 5-ns diffusion time is 29 times
longer, and thus $\sigma_q = 3\pi/2$ is very conservative. The
results of [8] are based on modeling of the laser dynamics
(see also the Appendix), supported by direct experimental
observations of the pulses. By considering systematic


XIII. CONCLUSIONS 


Establishing the randomness of data generated by a
physical process is a vexing challenge, with important
consequences for data security and stochastic simulations.
While many experiments have generated data
that in some way reflected the randomness of quantum
physics, many applications require both full randomness
and realistic assurances of randomness. We have
described a methodology and experimental standard of
proof for quantum randomness, similar to the methodology
of precision measurement.


Other formulations [32] have similar global properties:


The methodology is paranoid in the sense that it
assumes the worst case behavior for all untrusted variables.
As in precision measurement, it is possible to place
experimental constraints on the behavior of these variables
using auxiliary measurements and the generated data
themselves. A constrained numerical optimization of the
distribution of untrusted variables gives a lower bound
for the average min-entropy, the measure of randomness
appropriate to randomness extraction. This enables the
generation of nearly perfect $\epsilon$-random bit strings. A
confidence level, also paranoid, is assigned to the average
min-entropy estimate, and thus to the $\epsilon$-randomness of
the generated string.

We apply the method to an ultrafast phase-diffusion 
QRNG, and find the system is an efficient randomness 
generator even under this paranoid analysis. The result 
shows that strong experimental guarantees can be given 
for quantum random number generators.


Appendix A: Phase diffusion in diode lasers


The dynamics of a diode laser are described by a set of
stochastic differential equations that govern the exchange
of energy between the charge carriers (electrons) and the
field, driven by the injection current I, with noise added
from spontaneous emission and spontaneous loss of electrons.
We reproduce the equations from Agrawal:


$\dot{P} = (G_L/\sqrt{1+p} - \gamma) P + R_{sp} + F_P(t)$ (A1)

f (Gl - 7) + f (^2) 

2 2 1 -I- -I-p 

$\dot{N} = I/q - \gamma_e N - G_L P/\sqrt{1+p} + F_N(t)$. (A3)

Here P is the number of photons, $\phi$ is the phase of the
intra-cavity field and N is the number of charge carriers.
$F_P(t)$, $F_\phi(t)$, and $F_N(t)$ are $\delta$-correlated zero-mean
Langevin noise terms, giving diffusion coefficients

$D_{PP} = R_{sp} P$, $D_{NN} = R_{sp} P + \gamma_e N$,
$D_{\phi\phi} = R_{sp}/(4P)$, $D_{PN} = -R_{sp} P$,
$D_{P\phi} = 0$, $D_{N\phi} = 0$.

Here $R_{sp}$ is the rate of spontaneous emission, which
depends on N, while $\gamma_e$ is the decay rate of the carrier
population. The other variables describe laser characteristics
that are not important in this discussion. Note that
all of the noise terms are traceable to two spontaneous
processes: the spontaneous emission of photons $R_{sp}$ and
the spontaneous loss of carriers $\gamma_e N$, both of which give
rise to $\delta$-correlated noise. The dynamics are invariant
under a global change of $\phi$.

If we write the dynamical equation for $\phi$ as $\dot{\phi} = A + F_\phi(t)$,
we can formally integrate to find $\Delta\phi$, the change
in $\phi$ over one pulse cycle: $\Delta\phi = \int dt\, A(t) + \int dt\, F_\phi(t)$. The
former term is a contribution to $\phi^{(c)}$ and may depend
on, e.g., experimental variations in the current I. In
contrast, the latter term is $\phi^{(q)}$, the phase diffusion due
to spontaneous emission. As the integral of white noise,
$\phi^{(q)}$ is a Gaussian random variable. This conclusion is
not sensitive to the details of the model. Rather, it is
a consequence of our separation of the phase dynamics
into the part $\phi^{(q)}$ driven by spontaneous emission, and
the part $\phi^{(c)}$ driven by everything else. We do not estimate
the amount of diffusion here; rather, we leave this as a
parameter, to study the relationship of phase diffusion
to min-entropy generation, as in Figs. 7 and 8.
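The statement that an integral of white noise is Gaussian can be illustrated with a discrete random walk. The sketch below assumes, purely for illustration, that the phase receives many independent kicks of fixed size and random sign; the kick size, the number of kicks, and the function names are hypothetical and are not derived from the laser model.

```python
import random

def accumulated_phase(n_kicks, kick_size, rng):
    """Sum of n_kicks independent phase kicks of +/- kick_size (a discrete stand-in for white noise)."""
    return sum(kick_size if rng.random() < 0.5 else -kick_size
               for _ in range(n_kicks))

def sample_moments(values):
    """Return (variance, kurtosis) of the samples; kurtosis is 3 for a Gaussian."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    return m2, m4 / m2 ** 2

rng = random.Random(1)   # fixed seed so the run is repeatable
kick = 0.05              # hypothetical kick size in radians
results = {}
for n_kicks in (100, 400):
    phases = [accumulated_phase(n_kicks, kick, rng) for _ in range(4000)]
    results[n_kicks] = sample_moments(phases)
```

In this toy model the variance grows in proportion to the number of kicks, so the RMS width grows as the square root of the accumulation time, and the kurtosis approaches the Gaussian value of 3 even though each individual kick is far from Gaussian.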

From the phase invariance of Eqs. (A1)-(A3), subsequent
realizations of $\phi^{(q)}$ are independent. The phase
invariance is a possible weakness or point of attack on
the implementation. If an adversary could introduce a
coherent field at the laser frequency, they could bias the
laser toward a chosen phase. This attack appears difficult,
however, as there is no optical connection to the
outside world; all optical fibers terminate either on a
photodetector or on an optical absorber. In addition,
in the implementation used here, an optical isolator
incorporated into the laser package allows light to leave the
laser, but not to enter it.